===========================================================================
===========================================================================
============================ ============================
============================ ============================
============================ PARSE-O-MATIC ============================
============================ ============================
============================ ============================
===========================================================================
===========================================================================
Parse-O-Matic is Copyright (C) 1992, 1994
by
Pinnacle Software, CP 386 Mount Royal, Quebec, Canada H3P 3C6
U.S. Office: Box 714 Airport Road, Swanton, Vermont 05488 USA
Support Line (514) 345-9578 --- Free Files BBS (514) 345-8654
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
This is a SHAREWARE product. That means we would like you to
pass around unregistered copies to other people. If you have
a modem, please upload it to your favourite bulletin board
system, or give a copy to a friend whom you think might need
a program like this. Shareware means sharing! Pass it on!
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
===========================================================================
INTRODUCTION
===========================================================================
--------------------------
WHY YOU NEED PARSE-O-MATIC
--------------------------
There are plenty of programs out there that have valuable data locked away
inside them. How do you get that data OUT of one program and into another
one?
Some programs provide a feature which "exports" a file into some kind of
generic format. Perhaps the most popular of these formats is known as a
"comma-delimited file", which is a text file in which each data field is
separated by a comma. Literal strings -- which might themselves contain
commas -- are surrounded by double quotes. So a few lines from a
comma-delimited file might look something like this (an export from a
hypothetical database of people who owe your company money):
+--------------------------------------------------------------------+
| |
| "JONES","FRED","1234 GREEN AVENUE", "KANSAS CITY", "MO",293.64 |
| "SMITH","JOHN","2343 OAK STREET","NEW YORK","NY",22.50 |
| "WILLIAMS","JOSEPH","23 GARDEN CRESCENT","TORONTO","ON",16.99 |
| |
+--------------------------------------------------------------------+
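For readers who would like to see how such a file is consumed by another program, here is a minimal sketch in Python (Python is not part of Parse-O-Matic; it is used only for illustration). It reads the three sample records with the standard csv module. The skipinitialspace option is an assumption, made to tolerate the blanks that follow some commas in the first record; read_debtors is a hypothetical helper name.

```python
import csv
import io

SAMPLE = '''"JONES","FRED","1234 GREEN AVENUE", "KANSAS CITY", "MO",293.64
"SMITH","JOHN","2343 OAK STREET","NEW YORK","NY",22.50
"WILLIAMS","JOSEPH","23 GARDEN CRESCENT","TORONTO","ON",16.99
'''


def read_debtors(text):
    """Return one list of field strings per comma-delimited record."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [row for row in reader if row]


records = read_debtors(SAMPLE)
for last, first, street, city, region, owed in records:
    print(f"{first} {last} of {city}, {region} owes {float(owed):.2f}")
```

Note that every field comes back as text; the amount owed has to be converted to a number explicitly.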
Unfortunately, not all programs export or import data in this format.
Even more frustrating is a program that exports data in a format that is
ALMOST what you need!
If that's the case, you might decide to spend a few hours in a text editor,
modifying the export file so that the other program can understand it. Or
you might write a program to do the editing for you. Both solutions are
time-consuming.
An even more challenging problem arises when a program which has no export
capability does have the ability to "print" reports to a file. You can
write a program to read these files and convert them to something you can
use, but this can be a LOT of work!
----------------------------
PARSE-O-MATIC TO THE RESCUE!
----------------------------
Parse-O-Matic is a utility that interprets text and fixed-length files and
converts them to other formats. It can help you "boil down" reports into
their essential data. You can also use it to convert NEARLY compatible
file formats.
------------
HOW IT WORKS
------------
You need three things:
1) The Parse-O-Matic program
2) A Parse-O-Matic "POM" file (to tell Parse-O-Matic what to do)
3) The input file
The input file is usually a report from another program. We've provided
several examples of typical input files. For example, the file
EXAMPLE2.TXT comes from the AccPac accounting software. AccPac is a great
program, but its export capabilities leave something to be desired.
Parse-O-Matic can help!
===========================================================================
FUNDAMENTALS
===========================================================================
This documentation assumes that you are an experienced computer user. If
you have trouble, you might ask a programmer to help you -- POM file
creation is a little like programming!
-------------------------
THE PARSE-O-MATIC COMMAND
-------------------------
The format of the Parse-O-Matic command line is:
POM pom-file input-file output-file
Here's an example, as you would type it at the DOS command line:
POM POMFILE.POM REPORT.TXT OUTPUT.TXT
For a more formal description of the command line, start up POM by typing
this command at the DOS prompt:
POM
------------
THE POM FILE
------------
The POM file is a text file with a .POM extension. The following
conventions are used when interpreting the POM file:
- Null lines and lines starting with a semi-colon (comments) are ignored.
- A POM file may contain up to 500 lines of specifications.
Comment lines do not count in this total.
A POM file contains no "loops" (to use the programming term). Each line of
the input file is processed by the entire POM file. If you'd like it
expressed in terms of programming languages, here's what POM does:
+----------------------------------------------------------------+
| START: If there's nothing left in the input file, go to QUIT. |
| Read a line from the input file |
| Do everything in the POM file |
| Go to START |
| QUIT: Tell the user you're finished! |
+----------------------------------------------------------------+
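The same cycle can be sketched in Python (an illustration only, not the way Parse-O-Matic is actually written). The names run_pom and echo_line are hypothetical. The sketch assumes that variables keep their values from one input line to the next, and it leaves out the early exit that commands such as IGNORE and ACCEPT can cause.

```python
def run_pom(input_lines, pom_commands):
    """Model of the cycle above: every command runs once per input line."""
    variables = {}                      # assumed to persist between lines
    output = []
    for line in input_lines:            # START: stop when input runs out
        variables["$FLINE"] = line      # read a line from the input file
        for command in pom_commands:    # do everything in the POM file
            command(variables, output)
    return output                       # QUIT


def echo_line(variables, output):
    """Stand-in for a command that copies the input line to the output."""
    output.append(variables["$FLINE"])


result = run_pom(["first line", "second line"], [echo_line])
print(result)
```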
----------------------------------
A CODING TIP: PADDING FOR CLARITY
----------------------------------
Spaces and tabs between the words and variables in a POM file line are
generally ignored (except in the case of the OUT and OUTEND commands). You
can use spaces to make your POM files easier to read.
Additionally, in any line in the POM file, the following terms are ignored:
= THEN ELSE
These can be added to make the lines easier to read. For example, the IF
command can be written in any of the following ways:
Very terse:          IF PRICE "0.00" BONUS "0.00" "1.00"
Padded with spaces:  IF PRICE "0.00"   BONUS "0.00"   "1.00"
Fully padded:        IF PRICE = "0.00" THEN BONUS = "0.00" ELSE "1.00"
===========================================================================
COMMAND WORDS
===========================================================================
For ease of learning, the command words are explained in the following
order:
+-------------------------------------------------------------------------+
|                                                                         |
|  COMMANDS WHICH WILL...                LIST OF COMMANDS                 |
|  ----------------------------------    ----------------------           |
|  Break up an input line into fields    SET IF                           |
|  Generate output                       OUT OUTEND                       |
|  Accept or reject input                MINLEN IGNORE ACCEPT             |
|  Alter fields                          TRIM PAD INSERT CHANGE           |
|  Preprocess input                      SPLIT CHOP                       |
|                                                                         |
+-------------------------------------------------------------------------+
Here is a quick-reference table of all the commands:
-------------------------------------------  ------------------------------
COMMAND FORMATS                              EXAMPLE
===========================================  ==============================
SET var1 value1                              SET NAME $FLINE[20 26]
IF value1 value2 var1 value3 [value4]        IF X = "Y" THEN Z = "N"
-------------------------------------------  ------------------------------
OUT value1 value2 |output-picture            OUT "X" "X" |{PRICE}
OUTEND value1 value2 |output-picture         OUTEND "X" "X" |{$FLINE}
-------------------------------------------  ------------------------------
MINLEN number                                MINLEN "15"
IGNORE value1 value2                         IGNORE PRICE "0.00"
ACCEPT value1 value2                         ACCEPT $FLINE[1 3] "YES"
-------------------------------------------  ------------------------------
TRIM var1 spec1 character                    TRIM PRICE "R" "$"
PAD var1 spec1 character len                 PAD SERIALNUM "L" "0" "10"
INSERT var1 spec1 value1                     INSERT PRICE "L" "$"
CHANGE var1 value1 value2                    CHANGE DATE "/" "-"
-------------------------------------------  ------------------------------
SPLIT from to [,from to] [...]               SPLIT 1 250, 251 300
CHOP from to [,from to] [...]                CHOP 1 250, 251 300
-------------------------------------------  ------------------------------
The commands are explained in more detail (and in the same order) in the
following sections.
---------------
The SET Command
---------------
FORMAT: SET var1 value1
SET assigns a value to a variable. The usual reason to do this is to set a
variable from the input line (represented by the variable $FLINE) prior to
cleaning it up with TRIM. For example, if the input line looked like this:
JOHN       SMITH     555-1234   322 Westchester Lane    Architect
|          |         |          |                       |
Column 1   Col 12    Col 22     Col 33                  Col 57
then we could extract the last name from the input line with these two POM
commands:
SET NAME = $FLINE[12 21] (Sets the variable from the input line)
TRIM NAME "R" " " (Trims any spaces on the right side)
SET would first set the variable NAME to this value:     "SMITH     "
After the TRIM, the variable NAME would have the value:  "SMITH"
You will also use SET if you plan to include a substring of $FLINE in the
output, since the OUT and OUTEND commands do not recognize substrings after
the "|" marker, only complete variables.
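The column arithmetic can be checked with a short Python sketch (an illustration only). It assumes, as the example above implies, that columns are counted from 1 and that both the start and end columns are included. The helper name substring is hypothetical, and the sample line is rebuilt here from the column positions shown in the diagram.

```python
def substring(value, start, end):
    """Return columns start..end of value, counting from 1, ends included."""
    return value[start - 1:end]


# Rebuild the sample line: fields begin at columns 1, 12, 22, 33 and 57.
line = ("JOHN".ljust(11) + "SMITH".ljust(10) + "555-1234".ljust(11)
        + "322 Westchester Lane".ljust(24) + "Architect")

name = substring(line, 12, 21)     # like SET NAME = $FLINE[12 21]
trimmed = name.rstrip(" ")         # like TRIM NAME "R" " "
print(repr(name), repr(trimmed))
```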
--------------
The IF Command
--------------
FORMAT: IF value1 value2 var1 value3 [value4]
If value1 contains value2, var1 is set to value3. Otherwise, it is set to
value4. If value4 is missing, nothing is done (i.e. var1 is not changed).
Here's an example of the IF command...
SET EARNING = $FLINE[20 26]
TRIM EARNING "A" " "
IF EARNING = "0.00" THEN BONUS = "0.00" ELSE "1.00"
This would obtain the value between columns 20 and 26, remove any spaces,
then check if it equals "0.00". If it does, the variable BONUS is set to
"0.00". If not, BONUS is set to "1.00".
---------------------------
The OUT and OUTEND Commands
---------------------------
FORMAT: OUT[END] value1 value2 |output-picture
The OUT command generates output without an end-of-line (i.e. carriage
return and linefeed characters). The OUTEND command generates output and
also adds an end-of-line.
When value1 matches value2, a line is output to the output file, according
to the output picture. Within the output picture, all text is taken
literally (i.e. " is taken to mean literally that -- a quotation mark
character).
The only exception to this is variable names, which are identified by the
{ and } characters. For example, a POM file that contained the following
single line:
OUTEND "X" = "X" |{$FLINE}
would simply output every line from the input file (not very useful!).
The "X" = "X" part of the command is the comparator which controls when
output occurs; if both parts of the comparator are forced to the same
value, output will always occur.
NOTE: For efficiency, OUT does not write immediately to the output file; it
accumulates the output until it reaches 255 characters before writing. You
must do an OUTEND command to ensure that the data is actually written. No
single OUT or OUTEND command can output more than 255 characters.
You cannot use substrings after the "|" marker. Thus, the following line
is NOT legal:
OUTEND $FLINE[1 3] = "IBM" |{$FLINE[1 15]}
The correct way to code this is as follows:
SET CODE = $FLINE[1 15]
OUTEND $FLINE[1 3] = "IBM" |{CODE}
This would output the first 15 characters of any line that contains the
letters IBM in the first three positions.
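To make the idea of an output picture concrete, here is a small Python sketch (an illustration, not Parse-O-Matic's own code). It replaces each {NAME} in a picture with the value of that variable and copies everything else literally. The function name render_picture is hypothetical, variable names are assumed to be stored in uppercase, and an unset variable is treated as null.

```python
import re


def render_picture(picture, variables):
    """Replace each {NAME} in an output picture with the variable's value."""
    def lookup(match):
        # Names are not case-sensitive; an unset variable is taken as null.
        return variables.get(match.group(1).upper(), "")
    return re.sub(r"\{([^{}]+)\}", lookup, picture)


variables = {"$FLINE": "IBM PC/AT 6MHz 512K", "$BRL": "{", "$BRR": "}"}
variables["CODE"] = variables["$FLINE"][0:15]      # SET CODE = $FLINE[1 15]
if variables["$FLINE"][0:3] == "IBM":              # the comparator
    print(render_picture('"{CODE}",', variables))
```

Because only whole variable names are looked up between the braces, a substring such as $FLINE[1 15] has to be copied into its own variable first, just as the SET example above does.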
------------------
The MINLEN Command
------------------
FORMAT: MINLEN number
MINLEN specifies the minimum length a line must be to be considered for
parsing. If you omit the MINLEN command, the minimum length is assumed to
be 1. That is to say, all lines of 1 character or more will be processed
and shorter lines (null lines in other words) will be ignored.
MINLEN is useful for ignoring brief information lines that clutter up a
report that you are parsing. For example, in the sample file EXAMPLE2.POM,
the MINLEN command is set to 85 to ensure that all lines shorter than 85
characters will be ignored. This simplifies the coding considerably.
The longest allowable input line is 255 characters, unless you use the
SPLIT or CHOP command (described later).
------------------
The IGNORE Command
------------------
FORMAT: IGNORE value1 value2
When value1 contains value2, the input line is ignored and all further
processing on the input line stops. The usual format of this command is as
in this example:
IGNORE $FLINE[3 9] = "Date"
This would skip any input line that contains the word "Date" between
columns 3 and 9 ($FLINE is the line just read from the input file).
------------------
The ACCEPT Command
------------------
FORMAT: ACCEPT value1 value2
The ACCEPT command accepts the input line if value1 contains value2. For
example, if the entire POM file read as follows:
ACCEPT $FLINE[15 17] = "YES"
OUTEND "X" = "X" |{$FLINE}
then any input line that contains "YES" starting in column 15 would be sent
to the output file. All other lines would be ignored.
CLUSTERED ACCEPTS: Sometimes you have to check more than one value to see
if the input line is valid. You do this by using "clustered ACCEPTs",
which are several ACCEPT commands in a row.
Briefly stated, if you have several ACCEPTs in a row ("clustered"), they
are all processed to determine if the input line is acceptable or not. If
even one ACCEPT matches up, the line is accepted. To express this in more
detail...
When value1 contains value2, the line is accepted, and processing of the
POM file continues for that input line, even if the immediately following
ACCEPTs do NOT produce a match. After all, we've already got a match!
If value1 does NOT contain value2, Parse-O-Matic looks at the next command
in the POM file. If it is not another ACCEPT, the input line is ignored.
If it is another ACCEPT, maybe it will produce a match! So Parse-O-Matic
moves to that command.
The following POM file uses clustered ACCEPTs to accept any line that
contains the name "FRED" or "MARY" between columns 5 and 8, or contains the
word "MEMBER" between columns 20 and 25.
SET NAME = $FLINE[5 8] (Set the variable)
ACCEPT NAME = "FRED" (Look for FRED)
ACCEPT NAME = "MARY" (Look for MARY)
ACCEPT $FLINE[20 25] = "MEMBER" (Look for MEMBER)
OUTEND "X" = "X" |{$FLINE} (Output the line if we get this far)
The following example would NOT work, however:
ACCEPT $FLINE[20 25] = "MEMBER"
SET NAME = $FLINE[5 8]
ACCEPT NAME = "FRED"
ACCEPT NAME = "MARY"
OUTEND "X" = "X" |{$FLINE}
It would not work because the ACCEPTs are not clustered; if the first
ACCEPT fails, the input line will be rejected as soon as the SET command is
encountered. The next two ACCEPTs would not be reached in such a case.
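The clustering rule can be modelled in a few lines of Python (a sketch under stated assumptions, not the actual Parse-O-Matic logic). The POM file is reduced to a list of hypothetical ("SET", name, start, end) and ("ACCEPT", name, text) tuples, and an ACCEPT is taken to match when the variable contains the text. Unlike the examples above, the columns 20 to 25 are first copied into a variable called STATUS, purely to keep the sketch short.

```python
def line_survives(commands, variables):
    """Return True if the input line passes every cluster of ACCEPTs."""
    i = 0
    while i < len(commands):
        if commands[i][0] == "SET":
            _, name, start, end = commands[i]
            variables[name] = variables["$FLINE"][start - 1:end]
            i += 1
            continue
        # Start of a cluster: find where the run of ACCEPTs ends.
        j = i
        while j < len(commands) and commands[j][0] == "ACCEPT":
            j += 1
        cluster = commands[i:j]
        if not any(text in variables.get(name, "")
                   for _, name, text in cluster):
            return False              # no ACCEPT in the cluster matched
        i = j                         # a match: carry on after the cluster
    return True


line = "    FRED".ljust(19) + "GUEST"     # FRED in columns 5-8, no MEMBER

clustered = [("SET", "NAME", 5, 8), ("SET", "STATUS", 20, 25),
             ("ACCEPT", "NAME", "FRED"), ("ACCEPT", "NAME", "MARY"),
             ("ACCEPT", "STATUS", "MEMBER")]
unclustered = [("SET", "STATUS", 20, 25), ("ACCEPT", "STATUS", "MEMBER"),
               ("SET", "NAME", 5, 8),
               ("ACCEPT", "NAME", "FRED"), ("ACCEPT", "NAME", "MARY")]

print(line_survives(clustered, {"$FLINE": line}))
print(line_survives(unclustered, {"$FLINE": line}))
```

In the second list the lone ACCEPT for MEMBER forms a cluster of its own, so a line for FRED who is not a MEMBER is rejected before the later ACCEPTs are ever reached.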
----------------
The TRIM Command
----------------
FORMAT: TRIM var1 spec1 character
Removes characters from var1. This is usually used to remove blanks.
spec1 can be: A=All B=Both ends L=Left side only R=Right side only
For example:
SET PRICE = $FLINE[20 26]
TRIM PRICE "A" ","
TRIM PRICE "L" "$"
This would remove all commas from the variable "PRICE", and remove the
leading dollar sign. Thus:
If the input contained the string: "$25,783"
The first TRIM would change it to: "$25783"
The second TRIM would change it to: "25783"
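The four TRIM specifications can be imitated in Python as follows (a sketch, not Parse-O-Matic's implementation). The function name trim is hypothetical. It is assumed that "L", "R" and "B" strip every consecutive copy of the character from the end concerned, which is how the "trims any spaces" wording of the SET example reads.

```python
def trim(value, spec, char):
    """Remove char from value: A=all, B=both ends, L=left, R=right."""
    spec = spec.upper()
    if spec == "A":
        return value.replace(char, "")
    if spec == "B":
        return value.strip(char)
    if spec == "L":
        return value.lstrip(char)
    if spec == "R":
        return value.rstrip(char)
    raise ValueError("spec must be A, B, L or R")


price = "$25,783"
price = trim(price, "A", ",")     # TRIM PRICE "A" ","
price = trim(price, "L", "$")     # TRIM PRICE "L" "$"
print(price)
```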
---------------
The PAD Command
---------------
FORMAT: PAD var1 spec1 character len
PAD makes var1 a specified length, padded with a specified character.
spec1 is "L", "R", or "C" (Left, Right or Center)
character is the character used to pad the string
len is the desired string length
For example, if the variable ABC is set to "1234" ...
PAD ABC "L" "0" "7" left-pads it 7 characters wide with zeros ("0001234")
PAD ABC "R" " " "5" right-pads it 5 characters wide with spaces ("1234 ")
PAD ABC "C" "*" "8" would center it, 8 wide, with asterisks ("**1234**")
If len is less than the current length of the string, the string is left
unchanged. For example, if you set variable XYZ to "PINNACLE", then
PAD XYZ "R" " " "3"
would leave the string as-is ("PINNACLE").
Thus, PAD cannot be used to shorten a string. If it was your intention to
make XYZ 3 letters long, it would be appropriate to use the SET command:
SET XYZ = XYZ[1 3]
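The PAD rules map closely onto Python's string methods, as this sketch shows (an illustration only; pad is a hypothetical name). Note that left-padding corresponds to rjust and right-padding to ljust. How Parse-O-Matic centers a string when the padding cannot be shared out evenly is not stated above, so that case is left to Python's own behaviour here.

```python
def pad(value, spec, char, length):
    """Pad value to length with char: L=left, R=right, C=center."""
    if len(value) >= length:
        return value                  # PAD never shortens a string
    spec = spec.upper()
    if spec == "L":
        return value.rjust(length, char)
    if spec == "R":
        return value.ljust(length, char)
    if spec == "C":
        return value.center(length, char)
    raise ValueError("spec must be L, R or C")


print(pad("1234", "L", "0", 7))
print(repr(pad("1234", "R", " ", 5)))
print(pad("1234", "C", "*", 8))
print(pad("PINNACLE", "R", " ", 3))
```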
------------------
The INSERT Command
------------------
FORMAT: INSERT var1 spec1 value1
The INSERT command inserts text on the left or right of var1, or at a
"found text" position.
spec1 is "L" or "R" (Left or Right) or a find-string (e.g. "@HELLO")
value1 is the value to be inserted
For example, if the variable ABC is set to "ParseOMatic", then
INSERT ABC "L" "Register " would set ABC to "Register ParseOMatic"
INSERT ABC "R" " is super" would set ABC to "ParseOMatic is super"
You can use a find-string to insert text at the first occurrence of the text
you specify. For example:
INSERT ABC "@OMatic" "-" would set ABC to "Parse-OMatic"
If the find-string is not found, nothing is done.
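The three forms of INSERT can be sketched in Python like this (an illustration, with insert as a hypothetical name). Whether Parse-O-Matic matches the find-string with or without regard to case is not stated above; the sketch assumes an exact, case-sensitive match.

```python
def insert(value, spec, text):
    """Insert text at the left, the right, or before a found string."""
    if spec.upper() == "L":
        return text + value
    if spec.upper() == "R":
        return value + text
    if spec.startswith("@"):
        position = value.find(spec[1:])   # first occurrence only
        if position < 0:
            return value                  # not found: nothing is done
        return value[:position] + text + value[position:]
    raise ValueError("spec must be L, R or an @find-string")


abc = "ParseOMatic"
print(insert(abc, "L", "Register "))
print(insert(abc, "R", " is super"))
print(insert(abc, "@OMatic", "-"))
```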
------------------
The CHANGE Command
------------------
FORMAT: CHANGE var1 value1 value2
The CHANGE command replaces ALL occurrences of value1 with value2. This is
more powerful than TRIM, but is not as efficient. Here is an example of
the CHANGE command in action:
SET DATE = $FLINE[31 38]
CHANGE DATE "/" "--"
If the SET command assigned DATE the value: "93/10/15"
Then the CHANGE command would convert it to: "93--10--15"
-----------------
The SPLIT Command
-----------------
FORMAT: SPLIT from-position to-position [,from-pos'n to-pos'n] [...]
The maximum length of an input line from a text file is 255 characters. If
your input file is wider than that, you must break up the file into
manageable chunks, using the SPLIT command. This command lets you specify
the way in which each input line is broken up so that it will look like
several SEPARATE lines.
For example, if your input lines were up to 300 characters wide, you could
specify:
SPLIT 1 255, 256 300
This would break up each line as if it were two lines. (If some of the
lines were shorter than 256 characters, they would still be treated as two
lines, though the second line would be null, i.e. empty.)
You can specify up to 100 splits (use multiple SPLIT commands if
necessary). With SPLIT, Parse-O-Matic can handle input records of up to
32767 characters.
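The effect of SPLIT on one input line can be pictured with this Python sketch (an illustration only; split_line is a hypothetical name). Each from/to pair is treated as a 1-based, inclusive column range, and a range that lies beyond the end of the line yields a null piece.

```python
def split_line(line, ranges):
    """Cut one wide line into the pieces named by (start, end) ranges."""
    return [line[start - 1:end] for start, end in ranges]


ranges = [(1, 255), (256, 300)]           # like SPLIT 1 255, 256 300
wide = "x" * 255 + "y" * 45               # a 300-character line
parts = split_line(wide, ranges)
short = split_line("short line", ranges)  # second piece will be null
print([len(p) for p in parts], short)
```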
----------------
The CHOP Command
----------------
FORMAT: CHOP from-position to-position [,from-pos'n to-pos'n] [...]
The CHOP command works the same way as the SPLIT command, with one
exception: it informs Parse-O-Matic that the input is a fixed-record-
length file. In other words, it means that the input records are
distinguished by having a particular (and exact) length, rather than being
separated by end-of-line characters (Carriage Return, Linefeed) as is the
case for a standard text file.
Thus, if you have an input file containing fixed-length records, each of
which is 200 characters wide, you could specify it like this:
CHOP 1 200
If the input record is more than 255 characters, you must break it up into
smaller chunks. For example, if the input record was 300 characters wide,
you could break it up like this:
CHOP 1 250, 251 300
By using CHOP, Parse-O-Matic can handle input records up to 32767
characters wide.
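A fixed-record-length file can be pictured as one long run of characters with no end-of-line markers. The Python sketch below (an illustration, not Parse-O-Matic's own method) cuts such a run into records and then into chunks. It assumes that the record length is the largest "to" position in the CHOP command, which the text above implies but does not state outright; chop is a hypothetical name.

```python
def chop(data, ranges):
    """Cut fixed-length records out of data and split each by ranges.

    ranges holds 1-based, inclusive (start, end) column pairs. The record
    length is assumed to be the largest end column.
    """
    record_length = max(end for _, end in ranges)
    pieces = []
    for offset in range(0, len(data), record_length):
        record = data[offset:offset + record_length]
        pieces.extend(record[start - 1:end] for start, end in ranges)
    return pieces


# Two 300-character records, back to back, with no CR or LF between them.
data = "A" * 250 + "B" * 50 + "C" * 250 + "D" * 50
pieces = chop(data, [(1, 250), (251, 300)])   # like CHOP 1 250, 251 300
print([len(p) for p in pieces])
```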
===========================================================================
TERMS AND TECHNIQUES
===========================================================================
------
VALUES
------
A value can be specified in the following ways:
"text"              A literal text string
VARNAME             The name of a variable
VARNAME[start end]  A substring of a variable
VARNAME[start]      A single character
VARNAME+            Incremented variable (see explanation below)
Variable names can be up to 8 characters long. There is no distinction
between upper and lower case in the variable name. You can create up to
220 variables and literals.
Parse-O-Matic predefines several variables. They are:
$FLINE = The line just read from the file (max. length 255 characters)
$FLUPC = The line just read from the file, in uppercase
$BRL = The { character (used in OUT)
$BRR = The } character (used in OUT)
Since $FLINE has a maximum length of 255 characters, you will have to use
the SPLIT or CHOP command if your input file is wider than that.
----------
DELIMITERS
----------
If you need to specify a quotation mark, use "". For example:
IGNORE $FLINE = "He said ""Hello"" to me."
This would ignore lines containing: He said "Hello" to me.
------------------
ILLEGAL CHARACTERS
------------------
No command can contain these ASCII characters:
HEX DECIMAL NAME
--- ------- --------------------
$00 0 NULL
$0A 10 LF (Linefeed)
$0D 13 CR (Carriage Return)
Of course, LF and CR do appear at the very end of each line.
------------
INCREMENTING
------------
Only numeric incrementing is supported. Attempting to increment another
type of variable will result in an error.
- Incrementing "1" gives you "2"
- Incrementing "9" gives you "10"
The first time a variable is referenced, it has a null value. If you
increment this, it will be changed from "" (i.e. null) to "1".
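These rules amount to the following Python sketch (an illustration; increment is a hypothetical name). What Parse-O-Matic does with signs, decimals or leading zeros is not covered above, so the sketch accepts only plain digit strings and the null value.

```python
def increment(value):
    """Return the next number as text; a null value becomes "1"."""
    if value == "":
        return "1"
    if not value.isdigit():
        raise ValueError("only numeric values can be incremented")
    return str(int(value) + 1)


for start in ("1", "9", ""):
    print(repr(start), "->", repr(increment(start)))
```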
-------------
LINE COUNTERS
-------------
If your input record is divided over several lines (due to its original
format or perhaps because you used the SPLIT or CHOP command), it is
helpful to set up a line counter. The following example would extract the
first six characters of the second line of input records that span three
lines (designated lines 0, 1 & 2):
IF LineCntr = "1" THEN MyField = $FLINE[1 6]
OUTEND LineCntr = "1" |{MyField}
IF LineCntr = "2" THEN LineCntr = "" ELSE LineCntr+
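To follow the counter through a few records, here is the same logic as a Python sketch (an illustration only; the function names and the sample lines are made up). It assumes that the counter keeps its value between input lines and starts out null, so the three lines of each record see the values "", "1" and "2" in turn. The comparisons are modelled as simple equality.

```python
def increment(value):
    """A null value becomes "1"; otherwise add one."""
    return "1" if value == "" else str(int(value) + 1)


def second_line_fields(lines):
    """Collect the first six characters of the second line of each record."""
    line_cntr = ""                    # null the first time it is referenced
    output = []
    for fline in lines:
        # IF LineCntr = "1" THEN MyField = $FLINE[1 6]
        # OUTEND LineCntr = "1" |{MyField}
        if line_cntr == "1":
            output.append(fline[0:6])
        # IF LineCntr = "2" THEN LineCntr = "" ELSE LineCntr+
        line_cntr = "" if line_cntr == "2" else increment(line_cntr)
    return output


lines = ["REC-A header", "ALPHA1 detail", "REC-A footer",
         "REC-B header", "BRAVO2 detail", "REC-B footer"]
fields = second_line_fields(lines)
print(fields)
```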
-------
TRACING
-------
By setting the DOS environment variable POM to ALL, you can generate a
trace file, named POM.TRC. This is helpful if you have trouble
understanding why your file isn't being parsed properly. But be sure to
test it with a SMALL input file; the trace is quite detailed, and it can
easily generate a huge output file.
To save space, you can specify a particular list of variables to be traced,
rather than tracing everything. For example, to trace only the variable
PRICE, enter this DOS command:
SET POM=PRICE
To trace several variables, separate the variable names by slashes, as in
this example:
SET POM=PRICE/BONUS/NAME
--------
EXAMPLES
--------
Most of these techniques are demonstrated by the examples provided with the
standard Parse-O-Matic package. To see these examples, switch to your
Parse-O-Matic directory and type START at the DOS prompt.